\chapter{Determinant}
\label{ch:determinant}
The goal of this chapter is to give the basis-free
definition of the determinant:
that is, we're going to define $\det T$
for $T \colon V \to V$ without making reference to the encoding for $T$.
This will make it obvious that the determinant of a matrix
does not depend on the choice of basis,
and that several properties are vacuously true
(e.g.\ that the determinant is multiplicative).

The determinant is only defined for finite-dimensional
vector spaces, so if you want you can restrict
your attention to finite-dimensional vector spaces for this chapter.
On the other hand we do not need the
ground field to be algebraically closed.

\section{Wedge product}
\prototype{$\bigwedge^2(\RR^2)$ gives parallelograms.}
We're now going to define something called the wedge product.
It will look at first like the tensor product $V \otimes V$,
but we'll have one extra relation.

For simplicity, I'll first define the wedge product $\bigwedge^2(V)$.
But we will later replace $2$ with any $n$.

\begin{definition}
	Let $V$ be a $k$-vector space.
	The $2$-wedge product $\bigwedge^2(V)$ is the abelian group
	generated by elements of the form $v \wedge w$ (where $v,w \in V$),
	subject to the same relations
	\begin{align*}
		(v_1 + v_2) \wedge w &= v_1 \wedge w + v_2 \wedge w \\
		v \wedge (w_1 + w_2) &= v \wedge w_1 + v \wedge w_2 \\
		(c \cdot v) \wedge w &= v \wedge (c \cdot w)
	\end{align*}
	plus two additional relations:
	\[ v \wedge v = 0 \quad\text{and}\quad
		v \wedge w = - w \wedge v. \]
	As a vector space, its action is given by
	$c \cdot (v \wedge w) = (c \cdot v) \wedge w = v \wedge (c \cdot w)$.
\end{definition}
\begin{exercise}
	Show that the condition $v \wedge w = - (w \wedge v)$
	is actually extraneous:
	you can derive it from the fact that $v \wedge v = 0$.
	(Hint: expand $(v + w) \wedge (v + w) = 0$.)
\end{exercise}

This looks almost exactly the same as the definition for a tensor product,
with two subtle differences.
The first is that we only have $V$ now, rather than $V$ and $W$
as with the tensor product.\footnote{So maybe the wedge product
	might be more accurately called the ``wedge power''!}
Secondly, there is a new \emph{mysterious} relation
\[ v \wedge v = 0 \implies v \wedge w = - (w \wedge v). \]
What's that doing there?
It seems kind of weird.

I'll give you a hint.
\begin{example}
	[Wedge product explicit computation]
	Let $V = \RR^2$, and let $v = ae_1 + be_2$, $w = ce_1 + de_2$.
	Now let's compute $v \wedge w$ in $\bigwedge^2(V)$.
	\begin{align*}
		v \wedge w &= (ae_1 + be_2) \wedge (ce_1 + de_2) \\
		&= ac (e_1 \wedge e_1) + bd (e_2 \wedge e_2)
		+ ad (e_1 \wedge e_2) + bc (e_2 \wedge e_1) \\
		&= ad (e_1 \wedge e_2) + bc (e_2 \wedge e_1) \\
		&= (ad-bc) (e_1 \wedge e_2).
	\end{align*}
\end{example}

What is $ad-bc$? You might already recognize it:
\begin{itemize}
	\ii You might know that the area of the parallelogram
	formed by $v$ and $w$ is $ad-bc$.
	\ii You might recognize it as the determinant of
	$ \begin{bmatrix} a & c \\ b & d \end{bmatrix}$.
	In fact, you might even know that the determinant
	is meant to interpret hypervolumes.
\end{itemize}

\begin{center}
\begin{asy}
	pair v = (4,1);
	pair w = (2,3);
	dot("$0$", origin, dir(225));
	dot("$v = ae_1 + be_2$", v, dir(-45), red);
	dot("$w = ce_1 + de_2$", w, dir(135), red);
	dot("$v+w$", v+w, dir(45));
	label("$ad-bc$", (v+w)/2, blue);
	fill(origin--v--(v+w)--w--cycle, opacity(0.1)+lightcyan);
	draw(origin--v, EndArrow, Margins);
	draw(origin--w, EndArrow, Margins);
	draw(v--(v+w), EndArrow, Margins);
	draw(w--(v+w), EndArrow, Margins);
\end{asy}
\end{center}

This is absolutely no coincidence.
The wedge product is designed to interpret signed areas.
That is, $v \wedge w$ is meant to interpret the area of the parallelogram
formed by $v$ and $w$.
You can see why the condition $(cv) \wedge w = v \wedge (cw)$ would make sense now.
And now of course you know why $v \wedge v$ ought to be zero:
it's an area zero parallelogram!

The \textbf{miracle of wedge products} is that the only additional condition
we need to add to the tensor product axioms is that $v \wedge v = 0$.
Then suddenly, the wedge will do all our work of interpreting volumes for us.

\begin{remark*}
	[Side digression on definitions in mathematics]
	This ``property-based'' philosophy is a common trope in modern mathematics.
	You have some intuition about an object you wish to define,
	and then you write down a wishlist of properties that ``should'' follow.
	But then it turns out the properties are sufficient to work with,
	and so for the definition, you just define an abstract object
	satisfying all the properties on your wishlist.
	Thereafter the intuition plays no ``official'' role;
	it serves only as cheerleading motivation for the wishlist.

	For wedge products,
	the wishlist has only the single property $v \wedge v = 0$.
\end{remark*}

In analog to earlier:
\begin{proposition}
	[Basis of $\bigwedge^2(V)$]
	Let $V$ be a vector space
	with basis $e_1$, \dots, $e_n$.
	Then a basis of $\bigwedge^2(V)$ is
	\[ e_i \wedge e_j \]
	where $i < j$.
	Hence $\bigwedge^2(V)$ has dimension $\binom n2$.
\end{proposition}
\begin{proof}
	Surprisingly slippery, and also omitted.
	(You can derive it from the corresponding theorem on tensor products.)
\end{proof}

Now I have the courage to define a multi-dimensional wedge product.
It's just the same thing with more wedges.
\begin{definition}
	Let $V$ be a vector space and $m$ a positive integer.
	The space $\bigwedge^m(V)$ is generated by wedges of the form
	\[ v_1 \wedge v_2 \wedge \dots \wedge v_m \]
	subject to relations
	\begin{align*}
		\dots \wedge (v_1+v_2) \wedge \dots
			&= (\dots \wedge v_1 \wedge \dots)
			 + (\dots \wedge v_2 \wedge \dots) \\
		\dots \wedge (cv_1) \wedge v_2 \wedge \dots
			&= \dots \wedge v_1 \wedge (cv_2) \wedge \dots  \\
		\dots \wedge v \wedge v \wedge \dots &= 0 \\
		\dots \wedge v \wedge w \wedge \dots &=
			- (\dots \wedge w \wedge v \wedge \dots)
	\end{align*}
	As a vector space
	\[ c \cdot (v_1 \wedge v_2 \wedge \dots \wedge v_m)
	 = (cv_1) \wedge v_2 \wedge \dots \wedge v_m
	 = v_1 \wedge (cv_2) \wedge \dots \wedge v_m
	 = \dots .
	\]
\end{definition}
This definition is pretty wordy, but in English the three conditions say
\begin{itemize}
	\ii We should be able to add products like before,
	\ii You can put constants onto any of the $m$ components
	(as is directly pointed out in the ``vector space'' action), and
	\ii Switching any two \emph{adjacent} wedges negates the whole wedge.
\end{itemize}
So this is the natural generalization of $\bigwedge^2(V)$.
You can convince yourself that any element of the form
\[ \dots \wedge v \wedge \dots \wedge v \wedge \dots \]
should still be zero.

Just like $e_1 \wedge e_2$ was a basis earlier, we can find the basis
for general $m$ and $n$.
\begin{proposition}[Basis of the wedge product]
	Let $V$ be a vector space with basis $e_1, \dots, e_n$.
	A basis for $\bigwedge^m(V)$ consists of the elements
	\[ e_{i_1} \wedge e_{i_2} \wedge \dots \wedge e_{i_m} \]
	where
	\[ 1 \le i_1 < i_2 < \dots < i_m \le n. \]
	Hence $\bigwedge^m(V)$ has dimension $\binom nm$.
\end{proposition}
\begin{proof}[Sketch of proof]
	We knew earlier that $e_{i_1} \otimes \dots \otimes e_{i_m}$
	was a basis for the tensor product.
	Here we have the additional property that (a)
	if two basis elements re-appear then the whole thing becomes zero,
	thus we should assume the $i$'s are all distinct;
	and (b) we can shuffle around elements,
	and so we arbitrarily decide to put the basis elements
	in increasing order.
\end{proof}

\section{The determinant}
\prototype{$(ae_1+be_2)\wedge(ce_1+de_2) = (ad-bc)(e_1\wedge e_2)$.}

Now we're ready to define the determinant.
Suppose $T \colon V \to V$ is a square matrix.
We claim that the map $\bigwedge^m(V) \to \bigwedge^m(V)$ given on wedges by
\[ v_1 \wedge v_2 \wedge \dots \wedge v_m
	\mapsto T(v_1) \wedge T(v_2) \wedge \dots \wedge T(v_m). \]
and extending linearly to all of $\bigwedge^m(V)$ is a
well-defined  linear map
(Here ``well-defined'' means that equivalent elements of the domain
get mapped to equivalent elements of the codomain.
This, and linearity, both follow from $T$ being a linear map.)
We call that map $\bigwedge^m(T)$.
\begin{example}
	[Example of $\bigwedge^m(T)$]
	In $V = \RR^4$ with standard basis $e_1$, $e_2$, $e_3$, $e_4$,
	let $T(e_1) = e_2$, $T(e_2) = 2e_3$, $T(e_3) = e_3$ and $T(e_4) = 2e_2 + e_3$.
	Then, for example, $\bigwedge^2(T)$ sends
	\begin{align*}
		(e_1 \wedge e_2) + (e_3 \wedge e_4)
		&\mapsto T(e_1) \wedge T(e_2) + T(e_3) \wedge T(e_4) \\
		&= e_2 \wedge 2e_3 + e_3 \wedge (2e_2 + e_3) \\
		&= 2(e_2 \wedge e_3 + e_3 \wedge e_2) \\
		&= 0.
	\end{align*}
\end{example}

Now here's something interesting.
Suppose $V$ has dimension $n$, and let $m=n$.
Then $\bigwedge^n(V)$ has dimension $\binom nn = 1$ --- it's a one dimensional space!
Hence $\bigwedge^n(V) \cong k$.

So $\bigwedge^n(T)$ can be thought of as a linear map from $k$ to $k$.
But we know that \emph{a linear map from $k$ to $k$ is just multiplication by a constant}.
Hence $\bigwedge^n(T)$ is multiplication by some constant.
\begin{definition}
	Let $T \colon V \to V$, where $V$ is an $n$-dimensional vector space.
	Then $\bigwedge^n(T)$ is multiplication by a constant $c$;
	we define the \vocab{determinant} of $T$ as $c = \det T$.
\end{definition}

\begin{example}[The determinant of a $2 \times 2$ matrix]
	Let $V = \RR^2$ again with basis $e_1$ and $e_2$.
	Let
	\[ T = \begin{bmatrix}
			a & c \\ b & d
		\end{bmatrix}.
	\]
	In other words, $T(e_1) = ae_1 + be_2$
	and $T(e_2) = ce_1 + de_2$.

	Now let's consider $\bigwedge^2(V)$.
	It has a basis $e_1 \wedge e_2$.
	Now $\bigwedge^2(T)$ sends it to
	\[ e_1 \wedge e_2 \xmapsto{\bigwedge^2(T)}
		T(e_1) \wedge T(e_2) =
		(ae_1 + be_2) \wedge (ce_1 + de_2)
		= (ad-bc)(e_1 \wedge e_2).
	\]
	So $\bigwedge^2(T) \colon \bigwedge^2(V) \to \bigwedge^2(V)$
	is multiplication by $\det T = ad-bc$,
	because it sent $e_1 \wedge e_2$ to
	$(ad-bc)(e_1 \wedge e_2)$.
\end{example}
And that is the definition of a determinant.
Once again, since we defined it in terms of $\bigwedge^n(T)$,
this definition is totally independent of the choice of basis.
In other words, the determinant can be defined based on $T \colon V \to V$ alone
without any reference to matrices.

\begin{ques}
	Why does $\bigwedge^n(S \circ T) = \bigwedge^n(S) \circ \bigwedge^n(T)$?
\end{ques}
In this way, we also get \[ \det(S \circ T) = \det(S) \det(T) \] for free.

More generally if we replace $2$ by $n$,
an write out the result of expanding
\[ \left( a_{11}e_1 + a_{21}e_2 + \cdots \right) \wedge \dots \wedge
	\left( a_{1n}e_1 + a_{2n}e_2 + \dots + a_{nn} e_n \right) \]
then you will get the formula
\[ \det(A) = \sum_{\sigma \in S_n} \opname{sgn}(\sigma)
	a_{1, \sigma(1)} a_{2, \sigma(2)} \dots a_{n, \sigma(n)} \]
called the \vocab{Leibniz formula} for determinants.
American high school students will recognize it;
this is (unfortunately) taught as the definition of the determinant,
rather than a corollary of the better definition using wedge products.

\begin{exercise}
	Verify that expanding the wedge product
	yields the Leibniz formula for $n=3$.
\end{exercise}

\section{Characteristic polynomials, and Cayley-Hamilton}
Let's connect with the theory of eigenvalues.
Take a map $T \colon V \to V$, where $V$ is $n$-dimensional
over an algebraically closed field,
and suppose its eigenvalues
are $\lambda_1$, $\lambda_2$, \dots, $\lambda_n$ (with repetition).
Then the \vocab{characteristic polynomial} is given by
\[
	p_T(X) = (X-\lambda_1)(X-\lambda_2) \dots (X-\lambda_n).
\]
Note that if we've written $T$ in Jordan form, that is,
\[
	T = \begin{bmatrix}
		\lambda_1 & \ast & 0 & \dots & 0 \\
		0 & \lambda_2 & \ast & \dots & 0 \\
		0 & 0 & \lambda_3 & \dots & 0 \\
		\vdots & \vdots & \vdots & \ddots & \vdots \\
		0 & 0 & 0 & \dots & \lambda_n
	\end{bmatrix}
\]
(here each $\ast$ is either $0$ or $1$),
then we can hack together the definition
\[
	p_T(X) \defeq
	\det \left( X \cdot \id_n - T \right)
	= \det \begin{bmatrix}
		X - \lambda_1 & \ast & 0 & \dots & 0 \\
		0 & X - \lambda_2 & \ast & \dots & 0 \\
		0 & 0 & X - \lambda_3 & \dots & 0 \\
		\vdots & \vdots & \vdots & \ddots & \vdots \\
		0 & 0 & 0 & \dots & X - \lambda_n
	\end{bmatrix}.
\]
The latter definition is what you'll see in most
linear algebra books because it lets you define the characteristic polynomial
without mentioning the word ``eigenvalue''
(i.e.\ entirely in terms of arrays of numbers).
I'll admit it does have the merit that it means that given any matrix,
it's easy to compute the characteristic polynomial and hence
compute the eigenvalues;
but I still think the definition should be done in terms of
eigenvalues to begin with.
For instance the determinant definition obscures the following theorem,
which is actually a complete triviality.
\begin{theorem}[Cayley-Hamilton]
	Let $T \colon V \to V$ be a map of finite-dimensional
	vector spaces over an algebraically closed field.
	Then for any $T \colon V \to V$,
	the map $p_T(T)$ is the zero map.
\end{theorem}
Here, by $p_T(T)$ we mean that if
\[ p_T(X) = X^n + c_{n-1} X^{n-1} + \dots + c_0 \]
then \[ p_T(T) = T^n + c_{n-1} T^{n-1} + \dots + c_1 T +  c_0 I \]
is the zero map,
where $T^k$ denotes $T$ applied $k$ times.
We saw this concept already when we proved
that $T$ had at least one nonzero eigenvector.

\begin{example}[Example of Cayley-Hamilton using determinant definition]
	Suppose $T = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$.
	Using the determinant definition of characteristic polynomial,
	we find that $p_T(X) = (X-1)(X-4)-(-2)(-3) = X^2 - 5X - 2$.
	Indeed, you can verify that
	\[ T^2 - 5T - 2
		= \begin{bmatrix}
			7 & 10 \\
			15 & 22
		\end{bmatrix}
		- 5 \cdot \begin{bmatrix}
			1 & 2 \\
			3 & 4
		\end{bmatrix}
		- 2 \cdot \begin{bmatrix}
			1 & 0 \\
			0 & 1
		\end{bmatrix}
		= \begin{bmatrix}
			0 & 0 \\
			0 & 0
		\end{bmatrix}.
	\]
\end{example}
If you define $p_T$ without the word eigenvalue,
and adopt the evil view that matrices are arrays of numbers,
then this looks like a complete miracle.
(Indeed, just look at the terrible proofs on Wikipedia.)

But if you use the abstract viewpoint of $T$ as a linear map,
then the theorem is almost obvious:
\begin{proof}[Proof of Cayley-Hamilton]
	Suppose we write $V$ in Jordan normal form as
	\[ V = J_1 \oplus \dots \oplus J_m \]
	where $J_i$ has eigenvalue $\lambda_i$ and dimension $d_i$.
	By definition,
	\[ p_T(T) = (T - \lambda_1)^{d_1} (T - \lambda_2)^{d_2}
		\dots (T - \lambda_m)^{d_m}. \]
	By definition, $(T - \lambda_1)^{d_1}$ is the zero map on $J_1$.
	So $p_T(T)$ is zero on $J_1$.
	Similarly it's zero on each of the other $J_i$'s --- end of story.
\end{proof}
\begin{remark}
	[Tensoring up]
	The Cayley-Hamilton theorem holds without the hypothesis that
	$k$ is algebraically closed:
	because for example any real matrix can be regarded
	as a matrix with complex coefficients
	(a trick we've mentioned before).
	I'll briefly hint at how you can use tensor products to formalize this idea.

	Let's take the space $V = \RR^3$, with basis $e_1$, $e_2$, $e_3$.
	Thus objects in $V$ are of the form $r_1 e_1 + r_2 e_2 + r_3 e_3$
	where $r_1$, $r_2$, $r_3$ are real numbers.
	We want to consider essentially the same vector space,
	but with complex coefficients $z_i$ rather than real coefficients $r_i$.

	So here's what we do: view $\CC$ as a $\RR$-vector space
	(with basis $\{1,i\}$, say)
	and consider the \vocab{complexification}
	\[ V_\CC \defeq \CC \otimes_\RR V. \]
	Then you can check that our elements are actually of the form
	\[ z_1 \otimes e_1 + z_2 \otimes e_2 + z_3 \otimes e_3. \]
	Here, the tensor product is over $\RR$,
	so we have $z \otimes re_i = (zr) \otimes e_i$ for $r \in \RR$.
	Then $V_{\CC}$ can be thought as a three-dimensional vector space over $\CC$,
	with basis $1 \otimes e_i$ for $i \in \{1,2,3\}$.
	In this way, the tensor product lets us formalize the idea
	that we ``fuse on'' complex coefficients.

	If $T \colon V \to W$ is a map, then $T_\CC \colon V_\CC \to W_\CC$
	is just the map $z \otimes v \mapsto z \otimes T(v)$.
	You'll see this written sometimes as $T_\CC = \id \otimes T$.
	One can then apply theorems to $T_\CC$
	and try to deduce the corresponding results on $T$.
\end{remark}

\section\problemhead
\begin{problem}[Column operations]
	Show that for any real numbers $x_{ij}$ (here $1 \le i,j \le n$) we have
	\[
		\det
		\begin{bmatrix}
			x_{11} & x_{12} & \dots & x_{1n} \\
			x_{21} & x_{22} & \dots & x_{2n} \\
			\vdots & \vdots & \ddots & \vdots \\
			x_{n1} & x_{n2} & \dots & x_{nn} \\
		\end{bmatrix}
		=
		\det
		\begin{bmatrix}
			x_{11} + cx_{12} & x_{12} & \dots & x_{1n} \\
			x_{21} + cx_{22} & x_{22} & \dots & x_{2n} \\
			\vdots & \vdots & \ddots & \vdots \\
			x_{n1} + cx_{n2} & x_{n2} & \dots & x_{nn} \\
		\end{bmatrix}.
	\]
	\begin{hint}
		The point is that
		\[
			(v_1+cv_2) \wedge v_2 \dots \wedge v_n
			= v_1 \wedge v_2 \dots \wedge v_n
			+ c(v_2 \wedge v_2 \dots \wedge v_n)
		\]
		and the latter term is zero.
	\end{hint}
\end{problem}
\begin{problem}
	[Determinant is product of eigenvalues]
	Let $V$ be an $n$-dimensional vector space
	over an algebraically closed field $k$.
	Let $T \colon V \to V$ be a linear map with
	eigenvalues $\lambda_1$, $\lambda_2$, \dots, $\lambda_n$
	(counted with algebraic multiplicity).
	Show that $\det T = \lambda_1 \dots \lambda_n$.
	\begin{hint}
		You can either do this by writing $T$ in matrix form,
		or you can use the wedge definition of $\det T$
		with the basis given by Jordan form.
	\end{hint}
\end{problem}

\begin{problem}
	[Exponential matrix]
	Let $X$ be an $n \times n$ matrix with complex coefficients.
	We define the exponential map by
	\[ \exp(X) = 1 + X + \frac{X^2}{2!} + \frac{X^3}{3!} + \cdots \]
	(take it for granted that this converges to some $n \times n$ matrix).
	Prove that
	\[ \det(\exp(X)) = e^{\Tr X}. \]
	\begin{hint}
		This is actually immediate by taking any basis
		in which $X$ is upper-triangular!
	\end{hint}
\end{problem}


\begin{problem}
	[Extension to \Cref{prob:equal_dimension}]
	Let $T \colon V \to V$ be a map of finite-dimensional vector spaces.
	Prove that $T$ is an isomorphism
	if and only if $\det T \neq 0$.
	\begin{hint}
		You don't need eigenvalues (though they could work also).
		In one direction, recall that (by \Cref{prob:equal_dimension})
		we can replace ``isomorphism'' by ``injective''.
		In the other, if $T$ is an isomorphism,
		let $S$ be the inverse map and look at $\det(S \circ T)$.
	\end{hint}
	\begin{sol}
		Recall that (by \Cref{prob:equal_dimension})
		we can replace ``isomorphism'' by ``injective''.

		If $T(v) = 0$ for any nonzero $v$,
		then by taking a basis for which $e_1 = v$,
		we find $\bigwedge^n(T)$ will map $e_1 \wedge \dots$
		to $0 \wedge T(e_2) \wedge \dots = 0$,
		hence is the zero map, so $\det T = 0$.

		Conversely, if $T$ is an isomorphism,
		we let $S$ denote the inverse map.
		Then $1 = \det(\id) = \det(S \circ T) = \det S \det T$,
		so $\det T \neq 0$.
	\end{sol}
\end{problem}

\begin{problem}
	[Based on Sweden 2010]
	\onechili
	A herd of $1000$ cows of nonzero weight is given.
	Prove that we can remove one cow such that the remaining $999$ cows
	cannot be partitioned into two sets with equal sum of weights.
	\begin{hint}
		Consider $1000 \times 1000$ matrix $M$
		with entries $0$ on diagonal and $\pm 1$ off-diagonal.
		Mod $2$.
	\end{hint}
	\begin{sol}
		We proceed by contradiction.
		Let $v$ be a vector of length $1000$
		whose entries are weight of cows.
		Assume the existence of a matrix $M$ such that $Mv = 0$,
		with entries $0$ on diagonal and $\pm 1$ off-diagonal.
		But $\det M \pmod 2$ is equal to the number of derangements
		of $\{1, \dots, 1000\}$, which is odd.
		Thus $\det M$ is odd and in particular not zero,
		so $M$ is invertible.
		Thus $Mv = 0 \implies v = 0$, contradiction.
	\end{sol}
\end{problem}

\begin{problem}
	[Putnam 2015]
	\twochili
	Define $S$ to be the set of real matrices
	$\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)$
	such that $a$, $b$, $c$, $d$ form
	an arithmetic progression in that order.
	Find all $M \in S$ such that for some integer $k > 1$, $M^k \in S$.
	\begin{hint}
		There is a family of solutions other than just $a=b=c=d$.

		One can solve the problem using Cayley-Hamilton.
		A more ``bare-hands'' approach is to
		show the matrix is invertible (unless $a=b=c=d$)
		and then diagonalize the matrix as
		$
		M =
		\begin{bmatrix} s & -q \\ -r & p \end{bmatrix}
		\begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}
		\begin{bmatrix} p & q \\ r & s \end{bmatrix}
		=
		\begin{bmatrix}
			ps\lambda_1 - qr\lambda_2 & qs(\lambda_1-\lambda_2) \\
			pr(\lambda_2-\lambda_1) & ps\lambda_2 - qr\lambda_1
		\end{bmatrix}
		$.
	\end{hint}
	\begin{sol}
		The answer is
		\[ \begin{bmatrix} t&t\\t&t \end{bmatrix}
			\quad\text{and}\quad
			\begin{bmatrix} -3t&-t\\t&3t \end{bmatrix} \]
		for $t \in \RR$.
		These work by taking $k=3$.

		Now to see these are the only ones, consider an arithmetic matrix
		\[ M = \begin{bmatrix} a & a+e \\ a+2e & a+3e \end{bmatrix}. \]
		with $e \neq 0$.
		Its characteristic polynomial is $t^2 - (2a+3e)t - 2e^2$,
		with discriminant $(2a+3e)^2 + 8e^2$,
		so it has two distinct real roots; moreover, since $-2e^2 \le 0$
		either one of the roots is zero or they are of opposite signs.
		Now we can diagonalize $M$ by writing
		\[
			M =
			\begin{bmatrix} s & -q \\ -r & p \end{bmatrix}
			\begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}
			\begin{bmatrix} p & q \\ r & s \end{bmatrix}
			=
			\begin{bmatrix}
				ps\lambda_1 - qr\lambda_2 & qs(\lambda_1-\lambda_2) \\
				pr(\lambda_2-\lambda_1) & ps\lambda_2 - qr\lambda_1
			\end{bmatrix}
		\]
		where $ps-qr=1$. By using the fact the diagonal entries have sum equalling
		the off-diagonal entries, we obtain that
		\[ (ps-qr)(\lambda_1+\lambda_2) = (qs-pr)(\lambda_1-\lambda_2)
			\implies qs-pr = \frac{\lambda_1+\lambda_2}{\lambda_1-\lambda_2}. \]
		Now if $M^k \in S$ too then the same calculation gives
		\[ qs-pr = \frac{\lambda_1^k+\lambda_2^k}{\lambda_1^k-\lambda_2^k}. \]
		Let $x = \lambda_1/\lambda_2 < 0$ (since $-2e^2 < 0$). We appropriately get
		\[ \frac{x+1}{x-1} = \frac{x^k+1}{x^k-1}
			\implies \frac{2}{x-1} = \frac{2}{x^k-1}
			\implies x = x^k \implies x = -1 \text{ or } x = 0 \]
		and $k$ odd. If $x=0$ we get $e=0$ and if $x=-1$ we get $2a+3e=0$,
		which gives the curve of solutions that we claimed.

		A slicker approach is by Cayley-Hamilton.
		Assume that $e \neq 0$, so $M$ has two distinct real eigenvalues as above.
		We have $M^k = cM + d\id$ for some constants $c$ and $d$
		(since $M$ satisfies some quadratic polynomial).
		Since $M \in S$, $M^k \in S$ we obtain $d=0$.
		Thus $M^k = cM$, so it follows the eigenvalues of $M$ are negatives of each other.
		That means $\Tr M = 0$, and the rest is clear.
	\end{sol}
\end{problem}

%\begin{problem}[USAMO 2008, edited]
%	At a certain mathematical conference,
%	every two mathematicians are either friends or strangers.
%	At mealtime, every participant eats in one of two large dining rooms.
%	Each mathematician insists upon eating in a room which contains an
%	even number of his or her friends.
%	Assuming that such a split exists, prove that the number of ways
%	that the mathematicians may be split between the two rooms is a power of two.
%	% https://www.artofproblemsolving.com/Forum/viewtopic.php?p=3338962#p3338962
%\end{problem}
\begin{problem}
	\twochili
	Let $V$ be a finite-dimensional vector space over $k$ and $T \colon V \to V$.
	Show that
	\[
		\det(a \cdot \id_V - T) =
		\sum_{n=0}^{\dim V} a^{\dim V-n} \cdot (-1)^n
		\Tr\left( \bigwedge^n(T) \right)
	\]
	where the trace is taken by viewing $\bigwedge^n(T) \colon \bigwedge^n(V) \to \bigwedge^n(V)$.
	\begin{hint}
		Take bases, and do a fairly long calculation.
	\end{hint}
	\begin{sol}
		\newcommand{\Fix}{\opname{Fix}}
		\newcommand{\NoFix}{\opname{NoFix}}
		Pick a basis $e_1, \dots, e_n$ of $V$.
		Let $T$ have matrix $(x_{ij})$, and let $m = \dim V$.
		Let $\delta_{ij}$ be the Kronecker delta.
		Also, let $\Fix(\sigma)$ denote the fixed points of a permutation $\sigma$
		and let $\NoFix(\sigma)$ denote the non-fixed points.

		Expanding then gives
		\begin{align*}
			&\qquad \det (a \cdot \id - T) \\
			&= \sum_{\sigma \in S_m} \left( \sign(\sigma)
			\cdot \prod_{i=1}^m \left( a \cdot \delta_{i \sigma(i)} - x_{i \sigma(i)} \right)\right) \\
			% ------------------------
			&=
			\sum_{s=0}^m
			\sum_{1 \le i_1 < \dots < i_s \le m}
			\sum_{\substack{\sigma \in S_m \\ \sigma \text{ fixes } i_k}}
			\left( \sign(\sigma)
			\cdot \prod_{i=1}^m \left( a \cdot \delta_{i \sigma(i)} - x_{i \sigma(i)} \right)\right) \\
			% ------------------------
			&=
			\sum_{s=0}^m
			\sum_{1 \le i_1 < \dots < i_s \le m}
			\sum_{\substack{\sigma \in S_m \\ \sigma \text{ fixes } (i_k)}}
			\left( \sign(\sigma)
			\cdot \prod_{i \notin (i_k)} -x_{i \sigma(i)}
			\prod_{i \in (i_k)}^n \left( a \cdot - x_{ii}
			\right)\right) \\
			% -----------------------
			&=
			\sum_{\sigma \in S_m}
			\left( \sign(\sigma)
			\cdot \prod_{i \in \NoFix(\sigma)} -x_{i \sigma(i)}
			\prod_{i \in \Fix{\sigma}} \left( a - x_{ii}
			\right)\right) \\
			% -----------------------
			&=
			\sum_{\sigma \in S_m}
			\left( \sign(\sigma)
			\cdot \left( \prod_{i \in \NoFix(\sigma)} -x_{i \sigma(i)} \right)
			\left( \sum_{t=0}^{\left\lvert \Fix(\sigma) \right\rvert}
			a^{\left\lvert \Fix(\sigma) \right\rvert - t} \cdot \sum_{i_1 < \dots < i_t \in \Fix(\sigma)}
			\prod_{k=1}^t -x_{i_k i_k} \right)
			\right) \\
			% -----------------------
			&=
			\sum_{\sigma \in S_m}
			\left( \sign(\sigma)
			\left( \sum_{t=0}^{\left\lvert \Fix(\sigma) \right\rvert}
			a^{m-t-\left\lvert \NoFix(\sigma) \right\rvert}
			\sum_{\substack{X \subseteq \{1, \dots, m\} \\ \NoFix(\sigma) \subseteq X \\ X \text{ has exactly $t$ fixed}}} \prod_{i \in X} -x_{i \sigma(i)}
			\right) \right) \\
			% -----------------------
			&=
			\sum_{n=0}^m
			a^{m-n}
			\left(
			\sum_{\sigma \in S_m}
			\sign(\sigma)
			\sum_{\substack{X \subseteq \{1, \dots, m\} \\ \NoFix(\sigma) \subseteq X \\ \left\lvert X \right\rvert = n} }
			\prod_{i \in X} -x_{i \sigma(i)}
			\right) \\
			% -----------------------
			&= \sum_{n=0}^m
			a^{m-n} (-1)^n
			\left(
			\sum_{\substack{X \subseteq \{1, \dots, m\} \\ \left\lvert X \right\rvert = n} }
			\sum_{\substack{\sigma \in S_m \\ \NoFix(\sigma) \subseteq X}}
			\sign(\sigma) \prod_{i \in X} x_{i \sigma(i)}
			\right).
		\end{align*}

		Hence it's the same to show that
		\[
			\sum_{\substack{X \subseteq \{1, \dots, m\} \\ \left\lvert X \right\rvert = n} }
			\sum_{\substack{\sigma \in S_m \\ \NoFix(\sigma) \subseteq X}}
			\sign(\sigma) \prod_{i \in X} x_{i \sigma(i)}
			= \Tr_{\bigwedge^n(V)} \left( \bigwedge^n(T) \right)
		\]
		holds for every $n$.

		We can expand the definition of trace as using basis elements as
		\begin{align*}
			\Tr\left( \bigwedge^n(T) \right)
			&= \sum_{1 \le i_1 < \dots < i_n \le m}
			\left( \bigwedge_{k=1}^n e_{i_k} \right)^\vee
			\left( \bigwedge^n(T) \left( \bigwedge_{k=1}^n e_{i_k} \right) \right) \\
			&= \sum_{1 \le i_1 < \dots < i_n \le m}
			\left( \bigwedge_{k=1}^n e_{i_k} \right)^\vee
			\left(  \bigwedge_{k=1}^n T(e_{i_k}) \right) \\
			&= \sum_{1 \le i_1 < \dots < i_n \le m}
			\left( \bigwedge_{k=1}^n e_{i_k} \right)^\vee
			\left(  \bigwedge_{k=1}^n
			\left( \sum_{j=1}^m x_{i_k j} e_j \right)
			\right) \\
			&= \sum_{1 \le i_1 < \dots < i_n \le m}
			\sum_{\pi \in S_n} \sign(\pi) \prod_{k=1}^n x_{i_{\pi(k)}k} \\
			&= \sum_{\substack{X \subseteq \{1,\dots,m\} \\ \left\lvert X \right\rvert = n}}
			\sum_{\pi \in S_X} \sign(\pi) \prod_{i \in X} x_{t \pi(t)}
		\end{align*}
		Hence it remains to show that the permutations over $X$
		are in bijection with the permutations over $S_m$ which fix $\{1, \dots, m\} - X$,
		which is clear, and moreover, the signs clearly coincide.
	\end{sol}
\end{problem}

\begin{problem}
	[Cauchy-Binet formula]
	\onechili
	Let $n \ge s \ge 1$ be integers,
	and let $A$ and $B$ be an $s \times n$ matrix and $n \times s$ matrix, respectively.
	(Hence $AB$ is an $s \times s$ matrix.)
	For any subset $S \subseteq \{1, 2, \dots, n\}$ with $|S| = s$,
	we let $A_S$ be the $s \times s$ submatrix of $A$ of the rows with indices in $S$
	and let $B_S$ be the $s \times s$ submatrix of $B$ of the columns with indices in $S$.
	Prove that
	\[ \det(AB) = \sum_{|S| = s} \det A_S \det B_S. \]
	\begin{sol}
		See \url{https://www3.nd.edu/~andyp/notes/CauchyBinet.pdf}.
	\end{sol}
\end{problem}
